Tables
Columns: Year level in Australian curriculum; Example; Features and general advice; Advantages and Disadvantages

Frequency/tally table
For categorical or discrete numerical data.
Year level: 2, 3, 4, 5, 6
Example: Type of Travel to School, a table with category, tally and frequency columns (legible rows include walk 21, bus 20, bicycle 4, tram 0, skateboard etc. 2).
Features and general advice:
• Tally marks
• Frequency column
• Total
• Include all categories even where there is a 0 result
• Pie graphs, bar charts and histograms need data in a table before they can be created in Excel
Advantages:
• Useful for small datasets
• Quick recording of frequency using tally marks
• Retains all individual data values

Grouped frequency table
For numerical data.
Year level: (not specified)
Example: Height of Students, a table with Height (cm), tally and frequency columns and intervals from 130-<140 to 190-<200.
Features and general advice:
• Tally marks
• Frequency column
• Total
• Include all categories even where there is a 0 result
• Choose intervals in 5s, 10s, etc.
• Limit number of intervals to 8-10
• Use unambiguous interval labels, e.g. 140 -< 150 rather than 140 - 150 or 140 > 149
Advantages:
• Quick recording for large data sets with a wide spread
Disadvantages:
• Loses individual data values
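The two kinds of table described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original resource: the function names, the text rendering of tally marks and the sample data are all assumptions made for the example.

```python
from collections import Counter


def frequency_table(observations, categories):
    """Count each category, keeping categories that have a 0 result."""
    counts = Counter(observations)
    return {category: counts[category] for category in categories}


def grouped_frequency_table(values, start, width, n_intervals):
    """Count values in intervals labelled lower-<upper (upper bound excluded)."""
    table = {}
    for i in range(n_intervals):
        lower = start + i * width
        upper = lower + width
        table[f"{lower}-<{upper}"] = sum(1 for v in values if lower <= v < upper)
    return table


def tally(count):
    """Render a count as text tally marks in groups of five (assumed notation)."""
    groups = ["IIII/"] * (count // 5)
    if count % 5:
        groups.append("I" * (count % 5))
    return " ".join(groups)


# Hypothetical survey responses, not the data from the original table.
travel = ["walk", "bus", "walk", "bicycle", "bus", "walk", "car"]
modes = ["walk", "bus", "car", "bicycle", "tram"]
travel_table = frequency_table(travel, modes)

# Hypothetical heights in cm, grouped in 10s starting at 130.
heights = [138, 142, 150, 155, 159, 161, 167, 170]
height_table = grouped_frequency_table(heights, start=130, width=10, n_intervals=5)

for mode, count in travel_table.items():
    print(f"{mode:<8} {tally(count):<8} {count}")
print("Total", sum(travel_table.values()))
```

Passing the full list of categories separately is what keeps "tram" in the table with a frequency of 0, and the half-open intervals mean a height of exactly 150 falls in 150-<160 only.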
Picture graphs (pictographs)

One to one correspondence
Mainly used for categorical data.
Year level: 2, 3
Example: Mode of Travel to School (categories include bicycle, Boat/Ferry and Walk).
Features and general advice:
• Graphics should be drawn to scale where possible
• Include all categories even where there is a 0 result
Advantages:
• Visually appealing
• Useful for comparison of small data sets
• No need for frequency axis
Disadvantages:
• Need to count for exact total
• Potentially misleading if pictures not the same scale

Many to one correspondence
Used for categorical data.
Year level: 4, 6
Example: Mode of Travel to School, with the key "one picture represents 10 students" (categories bicycle, Boat/Ferry, Car, Skateboard, Walk, etc.).
Features and general advice:
• Graphics should be drawn to scale where possible
• Include all categories even where there is a 0 result
• Key
Advantages:
• Quicker for [illegible]
• No need for frequency axis
Disadvantages:
• Hard to quantify partial pictures
• Potentially misleading if pictures not the same scale
Bar Graphs (vertical/column or horizontal bar)

Bar chart
Used for categorical and discrete ungrouped numerical data. A horizontal bar chart is useful when the category names are long.
Year level: 3, 4, 5
Example: Mode of Travel to School, shown as a vertical and as a horizontal bar chart.
Features and general advice:
• Proportional columns
• Separate columns
• Columns are equal width separated by equal gap
• Title and axes labels
• Show units if used
• Key if necessary
Advantages:
• Useful for data comparison
Disadvantages:
• Can be misleading if scale does not begin at 0
• Tedious to use if many variables

Side by side column graphs
2 or more attributes for each variable.
Year level: 6
Example: Mode of Travel to School, horizontal axis Type of Transport (walk, bus, car, boat/ferry, bicycle, skate etc.), key: females, males.
Features and general advice:
• Proportional columns
• Columns are equal width and separate
• Column groups are separated by equal gap
• Title and axes labels
• Show units if used
• Key if necessary
Advantages:
• Useful for comparison of percentages
Disadvantages:
• Can be misleading if scale does not begin at 0 or sample sizes are unequal
• Tedious to use if many variables

Stacked bar chart
For 2 or more attributes compared among 2 or more categories.
Year level: (not specified)
Example: Favourite Take-Away Food (Pizza/Pasta, Kebabs/Wraps, Hamburgers, Chicken (e.g. BBQ chicken)).
Features and general advice:
• Proportional columns
• Separate columns
• Columns are equal width
• Equal width gaps
• Title and axes
• Key
• Show units if used
Advantages:
• Limited use for comparing few [illegible]
Disadvantages:
• Difficult to display if there are many variables
• Hard to compare 'like with like'
Dot Plots

One to one correspondence
Used for categorical and discrete numerical data.
Year level: 5, 7, 10 (Year 10: "Compare shapes of boxplots to corresponding histograms and dot plots")
Example: Mode of Travel to School for Boys (categories bus, boat/ferry, bicycle, etc.).
Features and general advice:
• Proportional dots
• Title and axes labels
Advantages:
• Quick for small quantities
• No need for frequency axis
• Easy to get a visual sense of [illegible]
Disadvantages:
• Need to count for exact total

Many to one correspondence
Used for categorical and discrete numerical data. NB Can use crosses etc.
Year level: 6
Example: Mode of Travel to School for Girls (categories boat/ferry, bicycle, skate, etc.).
Features and general advice:
• Proportional dots
• Title and axes labels
• Key
Advantages:
• Quick to construct
• No need for frequency axis
• Useful for large quantities
Disadvantages:
• Can be hard to quantify part dots
Pie Graphs

Used for categorical and discrete numerical data.
Year level: 6 (Elaboration)
Example: Eye Colour (sectors include Other and Grey).
NOTE: Yr 6 (Elaboration) "identifying potentially misleading data representations such as...pie charts in which the whole pie does not represent ..."
Features and general advice:
• Title
• Clear labels and proportional sectors
• Key if necessary
• % or number labels
• Total
• Segments usually ordered by size
Advantages:
• Useful to compare parts to the whole
Disadvantages:
• Requires skill to draw accurately
• Not useful for large number of categories
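"Proportional sectors" means each sector's angle is the same fraction of 360° as its frequency is of the total. A minimal sketch of that arithmetic follows; the function name and the eye colour counts are invented for illustration and do not come from the original chart.

```python
def pie_sectors(frequencies):
    """Return (category, percentage, angle in degrees), largest sector first."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("a pie graph needs at least one observation")
    ordered = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)
    return [(name, 100 * count / total, 360 * count / total) for name, count in ordered]


# Hypothetical eye colour counts for a class of 24 students.
eye_colour = {"Brown": 12, "Blue": 6, "Green": 4, "Grey": 1, "Other": 1}
sectors = pie_sectors(eye_colour)

for name, percent, angle in sectors:
    print(f"{name:<6} {percent:5.1f}%  {angle:5.1f} degrees")
```

Because every angle is computed from the same total, the sectors add up to the whole pie, which is the property the Year 6 note above asks students to check.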
Stem and Leaf Plots

Single
Used for discrete and continuous numerical data.
Year level: 7
Example: Belly button Heights. KEY 3|4 represents 34
Features and general advice:
• Ordered data (it is usual to complete an unordered plot as a first step)
• Title
• Key
• All stems in the range must be included even if there is no leaf
Advantages:
• Quick to draw
• Ordered so shows distribution shape
• Useful display to identify median (and quartiles)
Disadvantages:
• Data must first be ordered

Back to back
Used for discrete and continuous numerical data.
Year level: 9 ("Describe data using terms including 'skewed', 'symmetric', and 'bi modal'")
Example: Belly button Heights. KEY 3|4 represents 34
Features and general advice:
• As above
• Note: care needs to be taken when finding median, Q1 and Q3 on left hand leaves. Students need to be taught to always read leaves from the stem out.
Advantages:
• Ordered so shows shape of distribution
• Useful for comparison
Disadvantages:
• [illegible]; see the note on left hand leaves above

Split stems
Used for discrete and continuous numerical data.
Year level: (not specified)
Example: Dominant Hand Reaction Time. Key 3|1 represents 3.1
Features and general advice:
• For two stems use a starred stem for the upper leaves, e.g. 2 | 1, 3 and 2* | 5, 6. Key 2*|5 rep 2.5
• For five stems: Key 2*|8 rep 2.8
Advantages:
• Ordered so shows shape of distribution
• Useful to show distribution of quantities with a small range, e.g. birth weights
Disadvantages:
• Data must first be ordered
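The advice to order the leaves and to include every stem in the range, even one with no leaf, can be sketched as follows. This is an illustrative helper only: the function name and data are assumptions, and it handles non-negative whole numbers with the tens as the stem and the units as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of an ordered stem-and-leaf plot (stem = tens, leaf = units)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)
    rows = []
    # Walk every stem between the smallest and largest, so empty stems still appear.
    for stem in range(min(leaves), max(leaves) + 1):
        row = f"{stem:>2} | " + " ".join(str(leaf) for leaf in leaves[stem])
        rows.append(row.rstrip())
    return rows


# Hypothetical measurements; the key for this plot is 3|4 represents 34.
measurements = [34, 31, 52, 38, 50, 59, 31]
rows = stem_and_leaf(measurements)
print("\n".join(rows))
```

The empty row for stem 4 is deliberate: leaving it out would hide the gap in the distribution.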


Histograms

Ungrouped
Used for ungrouped discrete and continuous numerical data.
Year level: 9 ("Describe data using terms including 'skewed', 'symmetric', and 'bi modal'"), 10 ("Compare shapes of boxplots to corresponding histograms and dot plots")
Example: Hours slept on a school night (horizontal axis: Hours slept).
Features and general advice:
• Columns touch
• Category labels are mid column for discrete ungrouped data
• Note: grouped numerical data can be displayed in a histogram with interval names placed below the interval marker, e.g. 10 -, 20 -, 30 -, 40 -, 50 -
Advantages:
• Shows shape and spread of distribution
Disadvantages:
• Small data sets only

Grouped numerical
Year level: (not specified)
Example: Height of Students (horizontal axis: Height (cm)).
Features and general advice:
• Category labels at beginning of each column for grouped data
• Choose intervals in 5s, 10s, etc.
• Limit number of intervals to 8-10
Advantages:
• Useful when data has a large range
Disadvantages:
• Loss of individual data values
Box Plot (box and whisker plot)

Single box plot
Used for numerical data.
Year level: 10 ("Compare shapes of boxplots to corresponding histograms and dot plots")
Example: Box-whisker plot of RighFoot.
Features and general advice:
• Need to discuss outliers and whether or not to eliminate them from the data
• Can identify possible and probable outliers using: an outlier (x) is any value beyond the fences, where the fences are located at Q3 + 1.5 × IQR and Q1 - 1.5 × IQR
Advantages:
• Shows shape and spread of each quarter of the distribution
• Can identify median, IQR and range easily
Disadvantages:
• Loses individual data values

Parallel box plots
Used to compare the distribution of two numerical data sets.
Year level: 10
Example: Box-whisker plot of RighFoot, AU Multi-state 2010. Sample A: Sex = Female: RighFoot = 15 to 40. Sample B: Sex = Female: RighFoot = 15 to 45.
Features and general advice:
• Single axis used for multiple box plots
Advantages:
• Useful for comparison
• Can compare shapes of distributions
Disadvantages:
• Unable to determine exact values
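The fence rule for outliers (Q3 + 1.5 × IQR and Q1 - 1.5 × IQR) can be worked through in code. The sketch below is an assumption-laden illustration: it finds the quartiles as the medians of the lower and upper halves of the ordered data, which is one common school convention (software packages often use other quartile methods), and the foot lengths are invented.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]          # values below the median position
    upper_half = data[(n + 1) // 2:]     # values above the median position
    return data[0], median(lower_half), median(data), median(upper_half), data[-1]


def outliers(values):
    """Return the values that lie beyond the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical right foot lengths in cm (needs at least two values).
feet = [22, 23, 24, 24, 25, 25, 26, 27, 28, 40]
print(five_number_summary(feet))
print(outliers(feet))
```

Here Q1 = 24 and Q3 = 27, so IQR = 3 and the fences sit at 19.5 and 31.5; only the value 40 lies beyond them, and whether to discard it is the discussion point the table recommends.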


Scatter plots

Scatter plot
A bivariate display for numerical data. Relationship can be negative or positive, weak, strong or none, linear or non linear.
Year level: 10
Example: Height vs Belly Button Height (horizontal axis: Height (cm)).
Features and general advice:
• Title and axes labels
• Show units if used
• Dependent (response) variable on the vertical axis
• Independent (explanatory) variable on the horizontal axis
• Choose scale that gives the best view
Advantages:
• Useful to ascertain the relationship (if any) between two variables
Disadvantages:
• Outliers will affect relationship

Independent variable is time
Year level: 10
Example: Australian Historical Population - 20 - 24 year olds by Sex, 1955 - 2005 (vertical axis from 200,000 to 800,000; horizontal axis Year, 1950 to 2010).
Features and general advice:
• Title and axes labels
• Show units if used
• Show key if necessary
• Time is the independent variable
• Choose scale that gives the best view
Advantages:
• Used to look for trend over time
Disadvantages:
• Fluctuations can make trend difficult to see

Straight line of best fit (linear trend line)
Year level: 10A
Example: belly button height against Height (cm). The trend line on the chart is labelled y = 0.634x - 2.4567; the caption reads "Belly button height = 0.634 x height + 2.457 cm".
Features and general advice:
• Dependent (response) variable on the vertical axis
• Equation should be interpreted
• Gradient and y intercept can have meaning
Advantages:
• Useful for making predictions
• Useful for seeing relationships
Disadvantages:
• Extreme values will affect reliability
• Extrapolation less reliable than interpolation
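A straight line of best fit is usually found by least squares. The sketch below shows one way to compute the gradient and y intercept and to flag whether a prediction is an interpolation or an extrapolation. The function names and the paired data are hypothetical; they are not the data behind the chart in the original resource.

```python
def line_of_best_fit(xs, ys):
    """Return the least-squares gradient and y intercept for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    return gradient, intercept


def predict(x, gradient, intercept, xs):
    """Return the predicted y and True if x is inside the observed range (interpolation)."""
    return gradient * x + intercept, min(xs) <= x <= max(xs)


# Hypothetical heights (cm) and belly button heights (cm).
heights = [150, 160, 170, 180]
belly_button = [92, 98, 104, 110]
gradient, intercept = line_of_best_fit(heights, belly_button)

inside = predict(165, gradient, intercept, heights)    # interpolation
outside = predict(200, gradient, intercept, heights)   # extrapolation, less reliable
print(gradient, intercept, inside, outside)
```

For this made-up data the gradient is 0.6, which would be read as "each extra centimetre of height goes with about 0.6 cm more belly button height"; the second prediction is flagged because 200 cm is outside the heights that were measured.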
Summary statistics
Used for continuous and discrete numerical data.
Year level entries legible in the source: 7 ("locating mean, median and range on graphs and connecting them to real life"); 10 ("Compare shapes of boxplots to corresponding histograms and dot plots"); a further entry reads "interpret mean and standard deviation".

Measures of Centre: median, mean, mode
Numerical:
• Median: middle of ordered data
• Mean: sum of data divided by the number of data values
Categorical:
• Mode: most frequently occurring item
Features and general advice:
• Median: for data with outliers
• Mean: for data with reasonably symmetric distribution (no outliers)
• Mode: for categorical data

Measures of Spread: range
Numerical:
• Range

Outliers: effect on mean and [illegible]
Features and general advice:
• Care needs to be taken when deciding whether or not to discard [illegible]

Description of shape: skewed, symmetric, bi modal
Examples: histograms titled Height of Students 2, Height of Students 3 and Height of Students 4 (horizontal axis: Height (cm)).
Features and general advice:
• For skewed distributions the mean will be drawn towards the tail.
• Median is a more accurate measure of centre for skewed distributions.
• For symmetric distributions the mean and median will be similar.
• A bi modal distribution can indicate data has been collected from 2 distinct populations.

Measures of Spread: range, interquartile range, 5 number summary
Numerical:
• Range: max - min
• Interquartile range (IQR): Q3 - Q1
• 5 number summary: min, Q1, median, Q3, max
Features and general advice:
• IQR used as range when data has outliers

Measures of Spread: mean and standard deviation
For a normal distribution:
• 68% of observed values fall within 1 standard deviation of the mean,
• 95% of observed values fall within 2 standard deviations of the mean,
• 99.7% of observed values fall within 3 standard deviations of the mean
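The measures above can be computed with Python's standard `statistics` module. The sketch below is illustrative only: the helper names and scores are invented, and it uses the population standard deviation (`pstdev`), which is an assumption since the source does not say which form is intended.

```python
from statistics import mean, median, multimode, pstdev


def summary(values):
    """Return common measures of centre and spread for numerical data."""
    return {
        "mean": mean(values),
        "median": median(values),
        "modes": multimode(values),
        "range": max(values) - min(values),
        "sd": pstdev(values),  # population standard deviation (assumed)
    }


def share_within(values, k):
    """Return the fraction of values within k standard deviations of the mean."""
    centre, spread = mean(values), pstdev(values)
    return sum(1 for v in values if abs(v - centre) <= k * spread) / len(values)


# Hypothetical scores with one outlier (30).
scores = [2, 3, 3, 4, 5, 6, 7, 30]
with_outlier = summary(scores)
without_outlier = summary(scores[:-1])

print(with_outlier["mean"], with_outlier["median"])        # mean pulled towards the outlier
print(without_outlier["mean"], without_outlier["median"])
print(share_within(scores, 1), share_within(scores, 3))
```

Removing the single outlier moves the mean from 7.5 to about 4.3 while the median only moves from 4.5 to 4, which is why the median is recommended for data with outliers. The 68%, 95% and 99.7% figures apply to a normal distribution, so a small skewed sample like this one will not match them.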


Some general notes on making charts 


Charts convey quick visual information about a distribution. This is more obvious when diagrams use a scale so comparative integrity can be assumed. Charts in 2D are more accurately read than those in 3D. Graphs should:

• always show chart title and axes labels, and provide a key when necessary
• use a scale whenever possible
• be shown in 2D rather than 3D

Also:

• (Year 6 Elaboration) Beware of graphs that are “...potentially misleading ...such as...with ‘broken’ axes, non-linear scales...”
• From Year 3 “Create displays....with and without the use of digital technologies”


Glossary

Note: (A) indicates definition from the ACARA Glossary

Bar graph: (See also column graph) In a bar graph or chart, the bars can be either vertical or horizontal. (A)

Categorical data: A categorical variable is a variable whose values are categories. Categories may have numerical labels, for example, for the variable postcode the category labels would be numbers like 3787, 5623, 2016, etc, but these labels have no numerical significance. (A)

Column graph: A column graph is a graph used in statistics for organising and displaying categorical data. To construct a column graph, equal width rectangular bars are constructed with height equal to the observed frequency. Column graphs are frequently called bar graphs or bar charts. (A)

Continuous data: A continuous variable is a numerical variable that can take any value that lies within an interval. In practice, the values taken are subject to the accuracy of the measurement instrument used to obtain these values. (A)

Data: Data is a general term for a set of observations and measurements collected during any type of systematic investigation. Primary data is data collected by the user. Secondary data is data collected by others. Sources of secondary data include web-based data sets, the media, books, scientific papers, etc. (A)

Data display: A data display is a visual format for organising information (e.g. graphs, frequency tables). (A)

Dependent variable: A dependent variable (response variable) is one whose value depends on the value of another variable, e.g. height depends on age.

Discrete numerical variable: A discrete numerical variable is a numerical variable, each of whose possible values is separated from the next by a definite 'gap'. The most common numerical variables have the counting numbers 0,1,2,3,... as possible values. Others are prices, measured in dollars and cents. Examples include the number of children in a family or the number of days in a month. (A)

Distribution: The pattern of variation of a variable.

Dot plot: A dot plot is a chart where each data point is represented as a dot on a number line. Dots can represent more than one observation.

Independent variable: An independent variable (explanatory variable) is one whose value does not depend on the value of another variable.

Mean: The arithmetic mean of a list of numbers is the sum of the data values divided by the number of numbers in the list. (A)

Median: The median is the value in a set of ordered data that divides the data into two parts. It is frequently called the 'middle value'. Where the number of observations is odd, the median is the middle value. Where the number of observations is even, the median is calculated as the mean of the two central values. (A)

Mode: The mode is the most frequently occurring value in a set of data. When there are two modes, the data set is said to be bimodal. (A)

Numerical data: Can be discrete (data can take specified values only) or continuous (data can take any value within a range). Also see the note above under 'Categorical data'.

Picture graph: A graph that uses pictures to represent the frequency of categorical data. Each picture can represent one or more pieces of data.

Stem and leaf plots: Stem and leaf plots are tables where discrete data, e.g. the set of students' heights in cm, is represented (usually in order) by distinguishing values (the leaf) within set intervals (the stem). Stem plots must include a key, e.g. Key: 15|2 = 152 cm. Stem plots provide a visual indication of spread.

Variable: Any characteristic of a person or thing. Univariate data has only one attribute, e.g. eye colour. Bivariate data has two attributes, e.g. in a scatterplot a single point can represent both height and age.

